 taco dataset


Domain-Independent Automatic Generation of Descriptive Texts for Time-Series Data

Dohi, Kota, Ito, Aoi, Purohit, Harsh, Nishida, Tomoya, Endo, Takashi, Kawaguchi, Yohei

arXiv.org Artificial Intelligence

Due to the scarcity of time-series data annotated with descriptive texts, training a model to generate descriptive texts for time-series data is challenging. In this study, we propose a method to systematically generate domain-independent descriptive texts from time-series data. We identify two distinct approaches for creating pairs of time-series data and descriptive texts: the forward approach and the backward approach. By implementing the novel backward approach, we create the Temporal Automated Captions for Observations (TACO) dataset. Experimental results demonstrate that a contrastive-learning-based model trained on the TACO dataset is capable of generating descriptive texts for time-series data in novel domains.
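The abstract does not spell out the contrastive objective, so the following is only a minimal sketch of a generic InfoNCE-style loss that pulls each time-series embedding toward the embedding of its own caption and away from the other captions in the batch. The function names, the temperature value, and the toy two-dimensional embeddings are all assumptions made for illustration, not the authors' model.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(series_emb, text_emb, temperature=0.1):
    """Mean cross-entropy of matching series i to caption i within the batch.

    Hypothetical helper: a generic contrastive loss, not the paper's exact one.
    """
    n = len(series_emb)
    total = 0.0
    for i in range(n):
        # Similarity of series i to every caption, scaled by the temperature.
        logits = [cosine(series_emb[i], text_emb[j]) / temperature for j in range(n)]
        # Numerically stable log-sum-exp over the batch.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
        # Negative log-probability assigned to the matching caption.
        total += log_denominator - logits[i]
    return total / n


# Toy embeddings (made up): each series sits close to its own caption.
series = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.1, 0.9]]
swapped = [aligned[1], aligned[0]]

loss_aligned = info_nce(series, aligned)
loss_swapped = info_nce(series, swapped)
```

With these toy vectors, the loss for correctly paired embeddings is much smaller than for the swapped pairing, which is the behavior a contrastive objective is meant to reward.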


TACO: Topics in Algorithmic COde generation dataset

Li, Rongao, Fu, Jie, Zhang, Bo-Wen, Huang, Tao, Sun, Zhihong, Lyu, Chen, Liu, Guang, Jin, Zhi, Li, Ge

arXiv.org Artificial Intelligence

We introduce TACO, an open-source, large-scale code generation dataset with a focus on the topics of algorithms, designed to provide a more challenging training dataset and evaluation benchmark in the field of code generation models. TACO includes competition-level programming questions that are more challenging, to enhance or evaluate problem understanding and reasoning abilities in real-world programming scenarios. There are 25433 and 1000 coding problems in the training and test sets, respectively, as well as up to 1.55 million diverse solution answers. Moreover, each TACO problem includes several fine-grained labels such as task topics, algorithms, programming skills, and difficulty levels, providing a more precise reference for the training and evaluation of code generation models. The dataset and evaluation scripts are available on Hugging Face Hub (https://huggingface.co/datasets/BAAI/TACO) and GitHub (https://github.com/FlagOpen/TACO).
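To illustrate how fine-grained labels of this kind could be used to carve out a training or evaluation subset, here is a small sketch that filters problems by difficulty and algorithm tag. The `Problem` class, its field names, the label values, and the three sample records are hypothetical and invented for this example; the actual schema should be checked against the dataset card.

```python
from dataclasses import dataclass, field


@dataclass
class Problem:
    # Hypothetical fields mirroring the label kinds named in the abstract.
    question: str
    difficulty: str
    algorithms: list = field(default_factory=list)


def select(problems, difficulty=None, algorithm=None):
    """Return problems matching the given labels; None means no constraint."""
    return [
        p
        for p in problems
        if (difficulty is None or p.difficulty == difficulty)
        and (algorithm is None or algorithm in p.algorithms)
    ]


# Made-up records for illustration only; these are not taken from TACO.
problems = [
    Problem("Sum two integers.", "EASY", ["math"]),
    Problem("Shortest path in a weighted graph.", "HARD", ["graphs", "greedy"]),
    Problem("Longest increasing subsequence.", "MEDIUM", ["dynamic programming"]),
]

hard_graph = select(problems, difficulty="HARD", algorithm="graphs")
```

Keeping each filter optional lets the same helper select by difficulty alone, by algorithm alone, or by both.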